Urban Efficiency vs. Suburban Scarcity
Using the latest SEPTA General Transit Feed Specification (GTFS)
data, it
reveals that a staggering 80% of SEPTA’s trips are made by bus,
showcasing the critical role of bus transit in Philadelphia’s
transportation network. This statistic underlines the heavy reliance on
buses across the city, not just in suburban areas.
# Import GTFS data
stops <- read.csv("data/google_bus/stops.txt", header = TRUE, sep = ",")
stop_times <- read.csv("data/google_bus/stop_times.txt", header = TRUE, sep = ",")
trips <- read.csv("data/google_bus/trips.txt", header = TRUE, sep = ",")
shapes <- read.csv("data/google_bus/shapes.txt", header = TRUE, sep = ",")
routes <- read.csv("data/google_bus/routes.txt", header = TRUE, sep = ",")
calendar <- read.csv("data/google_bus/calendar.txt", header = TRUE, sep = ",")
calendar_dates <- read.csv("data/google_bus/calendar_dates.txt", header = TRUE, sep = ",")
# Import Swiftly data
swiftly <- st_read("data/SwiftlyShapes/geoJSONs/SEPTASurface_HighResSpeed_v2.json")
# Join stop_times and trips, from now on dat will be our main data set
dat <- left_join(stop_times, trips, by = "trip_id")
# join dat and routes
dat <- left_join(dat, routes, by = "route_id")
#determine route_type
dat <- dat %>%
mutate(mode = if_else(route_type == 0, "Trolley", ifelse(route_type == 1, "Metro", ifelse(route_type == 3, "Bus", "Trolleybus"))))
# Mode share (not ridership)
dat_trip_by_mode <- dat %>%
group_by(mode) %>%
summarise(mode_count = n(), .groups = 'drop') %>%
mutate(total_count = sum(mode_count),
mode_share = round(mode_count/sum(mode_count)*100, digits = 2)) %>%
ungroup()
# Plot mode share using ggplot2 and the custom palette
ggplot(dat_trip_by_mode, aes(x = mode, y = mode_share, fill = mode)) +
geom_bar(stat = "identity", width = 0.75) +
scale_fill_manual(values = bus_revolution_palette(length(unique(dat_trip_by_mode$mode)))) +
geom_text(aes(label = paste0(mode_share, "% (", mode_count, " trips)")),
vjust = -0.5, color = "black", size = 2.5) +
labs(title = "Mode Share by Transit Type",
x = "Mode",
y = "Mode Share (%)",
fill = "Mode") +
theme_minimal() +
theme(legend.position = "right",
plot.title = element_text(hjust = 0.5))

# total trip bus & trolley bus 1.980.505, total observations 2.026.677
The data also highlights distinct patterns in ridership, with bus
usage peaking during morning (6-9AM) and evening (3-6PM) rush hours on
weekdays. This contrasts with weekends, where the pattern flattens,
indicating a more even distribution of trips throughout the day.
# CREATE LINE GRAPH OF TRIPS BY SERVICE PERIODS
# Function to normalize times greater than 24:00:00. Important step: check the stop_times$arrival_time and departure_time values if it's greater than 24:00
normalize_time <- function(time_str) {
time_parts <- strsplit(time_str, ":")[[1]]
hours <- as.numeric(time_parts[1])
minutes <- as.numeric(time_parts[2])
seconds <- as.numeric(time_parts[3])
if (hours >= 24) {
hours <- hours - 24
time_str <- sprintf("%02d:%02d:%02d", hours, minutes, seconds)
}
return(time_str)
}
#identify peak_pm category
dat <- dat %>%
mutate(departure_time = sapply(departure_time, normalize_time), # departure_time's normalized
departure_time = lubridate::hms(departure_time),
hour = hour(departure_time), # Extract hour component for easier condition checking
peak_category = if_else(hour >= 4 & hour < 6, "Early Morning",
if_else(hour >= 6 & hour < 9, "AM Peak",
if_else(hour >= 9 & hour < 15, "Mid Day",
if_else(hour >= 15 & hour < 18, "PM Peak",
if_else(hour >= 18 & hour < 22, "Evening",
ifelse(hour >= 22 & hour < 24, "Late Night", "Owl")))))))
# # Check if there are any NA values in the 'departure_time' column
# summarise(dat,
# na_count = sum(is.na(departure_time)), #Count NA values
# total_rows = n(), # Total number of rows for context
# percentage_na = na_count / total_rows * 100 # Percentage of NA values
# )
durations <- c("Owl" = 4, "Early Morning" = 2, "AM Peak" = 3, "Mid Day" = 6, "PM Peak" = 3, "Evening" = 4, "Late Night" = 2)
dat_service_period <- dat %>%
filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
group_by(peak_category)%>%
summarise(peak_count = n(), .groups = 'drop') %>%
mutate(total_count = sum(peak_count),
peak_share = round(peak_count/sum(peak_count)*100, digits = 2)) %>%
arrange(factor(peak_category, levels = c("Owl", "Early Morning", "AM Peak", "Mid Day", "PM Peak", "Evening", "Late Night"))) %>%
ungroup()
# Create a data frame for service periods
service_periods <- data.frame(
peak_category = c("Owl", "Early Morning", "AM Peak", "Mid Day", "PM Peak", "Evening", "Late Night"),
start_hour = c(0, 4, 6, 9, 15, 18, 22),
end_hour = c(4, 6, 9, 15, 18, 22, 24)
)
# CREATE BAR CHART OF TRIPS BY SERVICE TYPES
# Ensure 'calendar' has 'day_type' defined
calendar <- calendar %>%
mutate(
day_type = case_when(
monday == 1 & tuesday == 1 & wednesday == 1 & thursday == 1 & friday == 1 & saturday == 0 & sunday == 0 ~ "Weekday",
(saturday == 1 | sunday == 1) ~ "Weekend",
TRUE ~ "irregular"
)
)
# Calendar_dates includes 'service_id', 'date', and 'exception_type'
# Ensure 'calendar_dates' has 'day_type' defined based on the date
calendar_dates <- calendar_dates %>%
filter(exception_type == 1) %>% #we only need exception_type == 1, those added to the service
mutate(
date = ymd(date), # Convert date format if necessary
day_of_week = wday(date, label = TRUE, week_start = 1),
day_type = case_when(
day_of_week %in% c("Mon", "Tue", "Wed", "Thu", "Fri") ~ "Weekday", # Monday to Friday
TRUE ~ "Weekend" # Saturday and Sunday
)
)
# Join calendar_dates with calendar
calendar_joined <- calendar_dates %>%
full_join(calendar, by = "service_id", suffix = c(".dates", ".cal")) %>%
mutate(
final_day_type = case_when(
# Use calendar_dates' day_type if exception_type indicates added service
exception_type == 1 & day_type.dates == "Weekday" ~ "Weekday",
exception_type == 1 & day_type.dates == "Weekend" ~ "Weekend",
# Default to calendar's day_type where there's no specific exception noted
TRUE ~ day_type.cal
)
) %>%
select(service_id, final_day_type) %>%
group_by(service_id, final_day_type) %>%
summarise(count = n(), .groups = 'drop') %>%
arrange(service_id, desc(count)) %>%
distinct(service_id, .keep_all = TRUE) %>% # Make sure no double day types for service ID
select(-count)
dat <- dat %>%
left_join(calendar_joined, by = "service_id")
dat_service_type <- dat %>%
filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
group_by(final_day_type)%>%
summarise(day_type_count = n(), .groups = 'drop') %>%
mutate(total_count = sum(day_type_count),
peak_share = round(day_type_count/sum(day_type_count)*100, digits = 2)) %>%
arrange(factor(final_day_type, levels = c("Weekday", "Weekend"))) %>%
ungroup()
# Summarize data to get the number of trips per hour
hourly_trips <- dat %>%
filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
group_by(hour, final_day_type) %>%
summarise(number_of_trips = n())
# Define a custom palette for the service periods
service_period_colors <- c(
"Owl" = "#E5E5E5",
"Early Morning" = "#C5E0B4",
"AM Peak" = "#A9D08E",
"Mid Day" = "#FFD966",
"PM Peak" = "#F4B084",
"Evening" = "#D5A6BD",
"Late Night" = "#A4C2F4"
)
# Plot the combined graph
ggplot() +
# Add service period background rectangles
geom_rect(data = service_periods, aes(xmin = start_hour, xmax = end_hour, ymin = 0, ymax = max(hourly_trips$number_of_trips), fill = peak_category), alpha = 0.3) +
# Add the line plot for weekday trips
geom_line(data = hourly_trips %>% filter(final_day_type == "Weekday"), aes(x = hour, y = number_of_trips), color = "#F2AE2E", size = 1) +
geom_point(data = hourly_trips %>% filter(final_day_type == "Weekday"), aes(x = hour, y = number_of_trips), color = "#26A699", size = 2) +
# Add the line plot for weekend trips
geom_line(data = hourly_trips %>% filter(final_day_type == "Weekend"), aes(x = hour, y = number_of_trips), color = "#A6123A", size = 1) +
geom_point(data = hourly_trips %>% filter(final_day_type == "Weekend"), aes(x = hour, y = number_of_trips), color = "#26A699", size = 2) +
# Define scales and labels
scale_x_continuous(breaks = 0:23) +
scale_y_continuous(labels = comma) +
scale_fill_manual(values = service_period_colors) +
scale_color_manual(values = c("Weekday" = "#F2AE2E",
"Weekend" = "#A6123A")) +
labs(title = "Number of Trips by Hour with Service Periods",
x = "Hour of Day",
y = "Number of Trips",
fill = "Service Period",
color = "Day Type") +
theme_minimal() +
theme(legend.position = "bottom",
plot.title = element_text(hjust = 0.5))

While bus services are integral to Philadelphia’s transit system, not
all areas benefit equally. Data specifically pertaining to suburban
regions during weekday AM and PM peak hours reveals a stark disparity:
most suburban bus stations face waiting times ranging from 30 to 60
minutes between buses. This significant wait time not only underscores
the challenges faced by commuters like Midge but also highlights a
broader issue affecting many suburban residents across Southeastern
Pennsylvania, who rely heavily on bus transit.
# SET UP STOPS CATEGORY. This step is crucial as we'll use stops information and join them for each kind of maps
# Convert stops data to sf object
stops <- stops %>%
st_as_sf(coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)
# Load PA and SEPTA region map for base map
PA <- st_read("data/PaCounty2024_05.geojson")
SEPTA_region_map <- PA[PA$COUNTY_NAM %in% c("MONTGOMERY", "BUCKS", "PHILADELPHIA", "DELAWARE", "CHESTER"), ] %>%
select(PA_CTY_COD, COUNTY_NAM, FIPS_COUNT)
# Load Center City boundary
center_city <- st_read("data/20240708_CenterCity/CenterCity.shp")
# Ensure all datasets are in the same CRS
SEPTA_region_map <- st_transform(SEPTA_region_map, crs = st_crs(stops))
center_city <- st_transform(center_city, crs = st_crs(stops))
# Perform the spatial intersection for county
stops <- stops %>%
st_join(SEPTA_region_map, left = TRUE, join = st_intersects) %>%
rename(county = COUNTY_NAM) %>%
mutate(center_city = st_within(stops, center_city, sparse = FALSE))%>%
mutate(center_city = rowSums(center_city) > 0) # Convert logical matrix to vector
# Categorize the stops
stops <- stops %>%
mutate(category = case_when(
county == "PHILADELPHIA" & center_city == TRUE ~ "PHILADELPHIA-Center_City",
county == "PHILADELPHIA" & center_city == FALSE ~ "PHILADELPHIA-Not_Center_City",
TRUE ~ county
)) %>%
select(-"center_city")
# SET UP HOURLY AVERAGE TRIP DATA FOR EACH BUS STOP
dat_trip_count_hr <- dat %>%
filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
group_by(stop_id, mode, final_day_type) %>%
summarise(trip_count = n(),
All_Routes = paste(unique(route_id), collapse = ", "),
.groups = 'drop') %>%
mutate(avg_trip_hr = round(trip_count/24)) %>%
mutate(log_avg_trip_hr = round(log1p(avg_trip_hr), digits = 0)) %>%
mutate(max_freq = ifelse(avg_trip_hr >= 12, "5 MAX",
ifelse (avg_trip_hr >= 6, "10 MAX",
ifelse (avg_trip_hr >= 4, "15 MAX",
ifelse(avg_trip_hr >= 2, "30 MAX", "60 MAX")))))
dat_trip_count_hr <- left_join(dat_trip_count_hr, stops, by = "stop_id")
dat_trip_count_hr <- st_as_sf(dat_trip_count_hr, coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)
# sum(dat_trip_count_hr$trip_count) #1980505 <- Always make sure you retain the number of observation
# SET UP AVERAGE TRIP DATA PER SERVICE PERIOD FOR EACH BUS STOP
dat_trip_count_prd <- dat %>%
filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
group_by(stop_id, peak_category, mode, final_day_type) %>%
summarise(trip_count = n(),
All_Routes = paste(unique(route_id), collapse = ", "),
.groups = 'drop') %>%
mutate(
hours = durations[peak_category],
avg_trip_hr = round(trip_count / hours)
) %>%
select(-hours) %>%
mutate(log_avg_trip_hr = round(log1p(avg_trip_hr), digits = 0)) %>%
mutate(max_freq = ifelse(avg_trip_hr >= 12, "5 MAX",
ifelse (avg_trip_hr >= 6, "10 MAX",
ifelse (avg_trip_hr >= 4, "15 MAX",
ifelse(avg_trip_hr >= 2, "30 MAX", "60 MAX")))))
dat_trip_count_prd <- left_join(dat_trip_count_prd, stops, by = "stop_id")
dat_trip_count_prd <- st_as_sf(dat_trip_count_prd, coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)
# sum(dat_trip_count_prd$trip_count) #1980505 <- Always make sure you retain the number of observation
SEPTA_region_map <- st_transform(SEPTA_region_map, st_crs(dat_trip_count_hr))
# # Define color palette
# pal <- colorNumeric(palette = viridisLite::viridis(256, option = "C"), domain = dat_trip_count_hr$avg_trip_hr)
# Define the color codes for bus revolution and reverse them
bus_revolution_colors <- rev(c("#A6123A", "#655BA6", "#26A699", "#F2AE2E", "#F2E5D5"))
# Ensure max_freq is a factor with levels matching the colors
dat_trip_count_hr$max_freq <- factor(dat_trip_count_hr$max_freq, levels = c("5 MAX", "10 MAX", "15 MAX", "30 MAX", "60 MAX"))
# Define the color palette for max_freq
max_freq_colors <- c("5 MAX" = "#4B0082", "10 MAX" = "#FF4500", "15 MAX" = "#A6123A", "30 MAX" = "#26A699", "60 MAX" = "#F2AE2E")
pal_max_freq <- colorFactor(palette = max_freq_colors, domain = dat_trip_count_hr$max_freq)
# # Generate the list for overlayGroups with the desired order
# overlay_groups <- expand.grid(final_day_type = c("Weekday", "Weekend"), peak_category = peak_order) %>%
# mutate(group_name = paste(final_day_type, peak_category, sep = "_")) %>%
# pull(group_name)
# Create the leaflet map
m_avg_trip_hr <- leaflet() %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(data = SEPTA_region_map, color = "black", weight = 1, fill = FALSE) %>%
addCircleMarkers(data = dat_trip_count_hr %>% filter(final_day_type == "Weekday"),
group = "Weekday",
lng = ~stop_lon, lat = ~stop_lat,
color = ~pal_max_freq(max_freq),
radius = 1,
opacity = 0.8,
popup = ~paste0("<b>Stop ID:</b> ", stop_id, "<br>",
"<b>Stop Name:</b> ", stop_name, "<br>",
"<b>Avg Buses/Hr:</b> ", avg_trip_hr, "<br>",
"<b>Max Frequency:</b> ", max_freq, "<br>",
"<b>All Routes:</b> ", All_Routes, "<br>",
"<b>Day Type:</b> ", final_day_type)) %>%
addCircleMarkers(data = dat_trip_count_hr %>% filter(final_day_type == "Weekend"),
group = "Weekend",
lng = ~stop_lon, lat = ~stop_lat,
color = ~pal_max_freq(max_freq),
radius = 1,
opacity = 0.8,
popup = ~paste0("<b>Stop ID:</b> ", stop_id, "<br>",
"<b>Stop Name:</b> ", stop_name, "<br>",
"<b>Avg Buses/Hr:</b> ", avg_trip_hr, "<br>",
"<b>Max Frequency:</b> ", max_freq, "<br>",
"<b>All Routes:</b> ", All_Routes, "<br>",
"<b>Day Type:</b> ", final_day_type)) %>%
addLayersControl(
overlayGroups = c("Weekday", "Weekend"),
options = layersControlOptions(collapsed = FALSE)
) %>%
hideGroup(c("Weekend")) %>%
addLegend(pal = pal_max_freq, values = dat_trip_count_hr$max_freq, title = "Max Frequency",
position = "bottomright")
m_avg_trip_hr
Further analysis of bus stop data by county—excluding Philadelphia’s
Center City—reveals a misleading equality in service distribution across
the counties. While the top bus stops in each county boast a 5-minute
maximum headway, reflecting seemingly equal service levels, the actual
number of buses tells a different story. For instance, even at their
busiest, bus stops in Bucks County see an average of only 15 buses per
hour at the 5 MAX category, starkly less than the 30 buses seen in
similar categories in Philadelphia’s outlying areas. This discrepancy
highlights the concentration of better services still within closer
proximity to urban centers, despite attempts at equitable
distribution.
# Filter and select the top 10 stops for 'mode' == 'bus' on the weekday
top_stops_hr_weekday <- dat_trip_count_hr %>%
filter(mode %in% c("Bus","Trolleybus"), final_day_type == "Weekday") %>%
select(stop_id, stop_name, avg_trip_hr, All_Routes) %>%
arrange(desc(avg_trip_hr)) %>%
slice_head(n = 10)
# Create a table with kable and enhance it with kableExtra
kable(top_stops_hr_weekday, "html", booktabs = TRUE,
caption = "Top 10 Bus Stops by Average Trips Per Hour on the Weekday",
align = c('l', rep('l', ncol(top_stops_hr_weekday) - 1))) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
column_spec(1, bold = TRUE, color = "#655BA6", width = "5em") %>%
column_spec(3, bold = TRUE, color = "#A6123A", extra_css = "text-align: right;")
Top 10 Bus Stops by Average Trips Per Hour on the Weekday
|
stop_id
|
stop_name
|
avg_trip_hr
|
All_Routes
|
geometry
|
|
21204
|
Frankford Transportation Center - Main Dropoff
|
47
|
14, 19, 20, 24, 26, 3, 50, 58, 67, 88, BLVDDIR, MFO
|
POINT (-75.07748 40.02329)
|
|
1148
|
69th St Transportation Center South Terminal
|
44
|
103, 104, 105, 107, 108, 109, 111, 113, 110, 120, 21, 30, 65, 68, MFO
|
POINT (-75.25828 39.96208)
|
|
10266
|
Market St & 15th St
|
40
|
124, 125, 17, 31, 32, 33, 38, 44, 48, 62, BSO, MFO
|
POINT (-75.16548 39.95255)
|
|
10272
|
Market St & 18th St
|
36
|
124, 125, 17, 31, 32, 33, 38, 44, 48, 62
|
POINT (-75.17017 39.95312)
|
|
18451
|
Market St & 16th St
|
36
|
124, 125, 17, 31, 32, 33, 38, 44, 48, 62
|
POINT (-75.16705 39.95274)
|
|
136
|
Cheltenham Av & Ogontz Av Loop
|
34
|
16, 6, H, XH
|
POINT (-75.1591 40.07483)
|
|
17842
|
JFK Blvd & 15th St
|
34
|
MANNLP, 124, 125, 17, 27, 31, 32, 33, 38, 44, 446, 62, 78
|
POINT (-75.16476 39.95365)
|
|
8936
|
JFK Blvd & 17th St
|
31
|
MANNLP, 124, 125, 17, 31, 32, 33, 38, 44, 446, 62
|
POINT (-75.16806 39.95406)
|
|
101
|
Pier 70 WalMart & Home Depot
|
29
|
25, 29, 64, 7
|
POINT (-75.14167 39.92531)
|
|
15497
|
69th St Transit Center
|
29
|
104, 109, 110, 111, 112, 120, 123, 126
|
POINT (-75.25968 39.96217)
|
# Define a custom palette for the categories
category_colors <- c(
"BUCKS" = "#A9D08E",
"CHESTER" = "#FFD966",
"DELAWARE" = "#F4B084",
"MONTGOMERY" = "#D5A6BD",
"PHILADELPHIA-Not_Center_City" = "#A4C2F4"
)
# Example with Weekday data
dat_filtered_weekday <- dat_trip_count_hr %>%
filter(mode %in% c("Bus","Trolleybus"), final_day_type == "Weekday" & category != "PHILADELPHIA-Center_City")
# Function to get top 5 stops per category
get_top_stops_per_category <- function(data, n = 5) {
data %>%
group_by(category) %>%
select(stop_id, stop_name, avg_trip_hr, max_freq, All_Routes, category) %>%
arrange(desc(avg_trip_hr)) %>%
slice_head(n = n) %>%
ungroup()
}
# Get top 5 stops per category for Weekday
top_stops_hr_weekday <- get_top_stops_per_category(dat_filtered_weekday, n = 5)
# Create the table for Weekday with row colors
kable(top_stops_hr_weekday, "html", booktabs = TRUE,
caption = "Top 5 Bus Stops by Average Trips Per Hour on the Weekday",
align = c('l', rep('l', ncol(top_stops_hr_weekday) - 1))) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
column_spec(1, bold = TRUE, color = "#655BA6", width = "5em") %>%
column_spec(3, bold = TRUE, color = "#A6123A", extra_css = "text-align: right;")
Top 5 Bus Stops by Average Trips Per Hour on the Weekday
|
stop_id
|
stop_name
|
avg_trip_hr
|
max_freq
|
All_Routes
|
category
|
geometry
|
|
330
|
Neshaminy Mall
|
16
|
5 MAX
|
128, 130, 14, 58, BLVDDIR
|
BUCKS
|
POINT (-74.95357 40.13806)
|
|
23080
|
Rockhill Rd & Neshaminy Rd - MBFS
|
14
|
5 MAX
|
128, 130, 14, 58, BLVDDIR
|
BUCKS
|
POINT (-74.95696 40.13665)
|
|
23071
|
Rockhill Rd & Neshaminy Blvd - MBNS
|
13
|
5 MAX
|
14, 58, BLVDDIR
|
BUCKS
|
POINT (-74.95725 40.13658)
|
|
22762
|
Roosevelt Blvd & Old Lincoln Hwy - MBFS
|
12
|
5 MAX
|
1, 14, BLVDDIR
|
BUCKS
|
POINT (-74.97913 40.12061)
|
|
25881
|
Roosevelt Blvd & Old Lincoln Hwy - MBNS
|
12
|
5 MAX
|
1, 14, BLVDDIR
|
BUCKS
|
POINT (-74.97909 40.12022)
|
|
1990
|
West Chester Transportation Center
|
8
|
10 MAX
|
135, 92, 104
|
CHESTER
|
POINT (-75.6073 39.9579)
|
|
18729
|
Chestnut St & Church St
|
6
|
10 MAX
|
135, 92, 104
|
CHESTER
|
POINT (-75.60707 39.96102)
|
|
18730
|
Chestnut St & Darlington St
|
6
|
10 MAX
|
135, 92, 104
|
CHESTER
|
POINT (-75.60838 39.96047)
|
|
30146
|
New St & Chestnut St - FS
|
6
|
10 MAX
|
135, 92, 104
|
CHESTER
|
POINT (-75.60984 39.95955)
|
|
14820
|
Church St & University Av - FS
|
5
|
15 MAX
|
104
|
CHESTER
|
POINT (-75.6 39.95252)
|
|
1148
|
69th St Transportation Center South Terminal
|
44
|
5 MAX
|
103, 104, 105, 107, 108, 109, 111, 113, 110, 120, 21, 30, 65, 68, MFO
|
DELAWARE
|
POINT (-75.25828 39.96208)
|
|
15497
|
69th St Transit Center
|
29
|
5 MAX
|
104, 109, 110, 111, 112, 120, 123, 126
|
DELAWARE
|
POINT (-75.25968 39.96217)
|
|
2024
|
69th St Transportation Center North Terminal
|
25
|
5 MAX
|
103, 105, 106, 123, 30, 65
|
DELAWARE
|
POINT (-75.25861 39.96282)
|
|
14785
|
Chester Transportation Center
|
20
|
5 MAX
|
117, 119, 113, 37
|
DELAWARE
|
POINT (-75.35997 39.84921)
|
|
30597
|
Darby Transportation Center - bus
|
17
|
5 MAX
|
114, 115, 113
|
DELAWARE
|
POINT (-75.26358 39.9191)
|
|
136
|
Cheltenham Av & Ogontz Av Loop
|
34
|
5 MAX
|
16, 6, H, XH
|
MONTGOMERY
|
POINT (-75.1591 40.07483)
|
|
1649
|
Norristown Transportation Center
|
22
|
5 MAX
|
131, 90, 93, 96, 97, 98, 99
|
MONTGOMERY
|
POINT (-75.34482 40.11345)
|
|
809
|
Willow Grove Park Mall
|
20
|
5 MAX
|
310, 311, 95, 22, 55
|
MONTGOMERY
|
POINT (-75.12249 40.14226)
|
|
1827
|
Plaza at King of Prussia
|
19
|
5 MAX
|
124, 139, 92, 99, 123, 125
|
MONTGOMERY
|
POINT (-75.39311 40.08983)
|
|
246
|
Plymouth Meeting Mall
|
16
|
5 MAX
|
150, 90, 95, 98, 27, L
|
MONTGOMERY
|
POINT (-75.28403 40.1172)
|
|
21204
|
Frankford Transportation Center - Main Dropoff
|
47
|
5 MAX
|
14, 19, 20, 24, 26, 3, 50, 58, 67, 88, BLVDDIR, MFO
|
PHILADELPHIA-Not_Center_City
|
POINT (-75.07748 40.02329)
|
|
101
|
Pier 70 WalMart & Home Depot
|
29
|
5 MAX
|
25, 29, 64, 7
|
PHILADELPHIA-Not_Center_City
|
POINT (-75.14167 39.92531)
|
|
28
|
Wissahickon Transportation Center - onsite
|
28
|
5 MAX
|
124, 125, 1, 35, 38, 61, R
|
PHILADELPHIA-Not_Center_City
|
POINT (-75.20724 40.0149)
|
|
8712
|
Hunting Park Av & 22nd St
|
27
|
5 MAX
|
33, 445, 447, 56, H, R, XH
|
PHILADELPHIA-Not_Center_City
|
POINT (-75.16457 40.01089)
|
|
410
|
Philadelphia Mills & Marshalls
|
24
|
5 MAX
|
129, 130, 20, 50, 67, 84
|
PHILADELPHIA-Not_Center_City
|
POINT (-74.96447 40.08583)
|
---
title: "Waiting for Change: The Urgent Need for Equitable Bus Services in Suburban Philadelphia"
author: "Regy Septian"
date: "2024-09-24"
output: 
  html_document:
    toc: true
    toc_float: true
    code_folding: "hide"
    code_download: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, cache = FALSE)

library(tidyverse)
library(sf)
library(tidytransit)
library(lubridate)
library(hms)
library(tidyr)
library(patchwork)
library(kableExtra)
library(wk)
library(leaflet)
library(leaflet.extras)
library(viridis)
library(viridisLite)
library(htmlwidgets)
library(htmltools)
library(leaflet.extras)
library(forcats)
library(ggplot2)
library(scales)
library(shiny)
library(webshot2)
library(magick)

# Define the color codes
bus_revolution_colors <- c("#A6123A", "#655BA6", "#26A699", "#F2AE2E", "#F2E5D5")

# Create a custom palette function
bus_revolution_palette <- function(n) {
  colorRampPalette(bus_revolution_colors)(n)
}

workdir <- "~/GitHub/MUSA-6310_Assignment-2" # Change the directory
setwd(workdir)
```
### A Commuter's Challenge in Suburban Philadelphia  

Every morning, Midge, a recent graduate working in downtown Philadelphia, starts her day battling the limitations of suburban bus transit. With services far less frequent than those in urban centers, her story illustrates the broader struggles faced by many suburban commuters, dependent on a system that doesn't meet their needs.  

### Urban Efficiency vs. Suburban Scarcity  

Using the latest SEPTA General Transit Feed Specification (GTFS) data[^1], it reveals that a staggering 80% of SEPTA's trips are made by bus, showcasing the critical role of bus transit in Philadelphia’s transportation network. This statistic underlines the heavy reliance on buses across the city, not just in suburban areas. 

[^1]: [SEPTA GTFS](https://www3.septa.org/developer/gtfs_public.zip)


```{r import gtfs, message=FALSE, warning=FALSE, cache=TRUE, results = 'hide'}
# Import GTFS data
stops <- read.csv("data/google_bus/stops.txt", header = TRUE, sep = ",")
stop_times <- read.csv("data/google_bus/stop_times.txt", header = TRUE, sep = ",")
trips <- read.csv("data/google_bus/trips.txt", header = TRUE, sep = ",")
shapes <- read.csv("data/google_bus/shapes.txt", header = TRUE, sep = ",")
routes <- read.csv("data/google_bus/routes.txt", header = TRUE, sep = ",")
calendar <- read.csv("data/google_bus/calendar.txt", header = TRUE, sep = ",")
calendar_dates <- read.csv("data/google_bus/calendar_dates.txt", header = TRUE, sep = ",")

# Import Swiftly data
swiftly <- st_read("data/SwiftlyShapes/geoJSONs/SEPTASurface_HighResSpeed_v2.json")

```

```{r total trip, message=FALSE, warning=FALSE, cache=TRUE}
# Join stop_times and trips, from now on dat will be our main data set
dat <- left_join(stop_times, trips, by = "trip_id")

# join dat and routes
dat <- left_join(dat, routes, by = "route_id")

#determine route_type
dat <- dat %>%
  mutate(mode = if_else(route_type == 0, "Trolley", ifelse(route_type == 1, "Metro", ifelse(route_type == 3, "Bus", "Trolleybus"))))

# Mode share (not ridership)
dat_trip_by_mode <- dat %>%
  group_by(mode) %>%
  summarise(mode_count = n(), .groups = 'drop') %>%
  mutate(total_count = sum(mode_count),
         mode_share = round(mode_count/sum(mode_count)*100, digits = 2)) %>%
  ungroup() 
  
# Plot mode share using ggplot2 and the custom palette
ggplot(dat_trip_by_mode, aes(x = mode, y = mode_share, fill = mode)) +
  geom_bar(stat = "identity", width = 0.75) +
  scale_fill_manual(values = bus_revolution_palette(length(unique(dat_trip_by_mode$mode)))) +
  geom_text(aes(label = paste0(mode_share, "% (", mode_count, " trips)")), 
            vjust = -0.5, color = "black", size = 2.5) +
  labs(title = "Mode Share by Transit Type",
       x = "Mode",
       y = "Mode Share (%)",
       fill = "Mode") +
  theme_minimal() +
  theme(legend.position = "right",
        plot.title = element_text(hjust = 0.5))

# total trip bus & trolley bus 1.980.505, total observations 2.026.677
```

The data also highlights distinct patterns in ridership, with bus usage peaking during morning (6-9AM) and evening (3-6PM) rush hours on weekdays. This contrasts with weekends, where the pattern flattens, indicating a more even distribution of trips throughout the day.
  
```{r peak time, message=FALSE, warning=FALSE, cache=TRUE, results = 'hide'}
# CREATE LINE GRAPH OF TRIPS BY SERVICE PERIODS

# Function to normalize times greater than 24:00:00. Important step: check the stop_times$arrival_time and departure_time values if it's greater than 24:00 
normalize_time <- function(time_str) {
  time_parts <- strsplit(time_str, ":")[[1]]
  hours <- as.numeric(time_parts[1])
  minutes <- as.numeric(time_parts[2])
  seconds <- as.numeric(time_parts[3])
  
  if (hours >= 24) {
    hours <- hours - 24
    time_str <- sprintf("%02d:%02d:%02d", hours, minutes, seconds)
  }
  
  return(time_str)
}

#identify peak_pm category 
dat <- dat %>%
  mutate(departure_time = sapply(departure_time, normalize_time), # departure_time's normalized
         departure_time = lubridate::hms(departure_time),
         hour = hour(departure_time),  # Extract hour component for easier condition checking
         peak_category = if_else(hour >= 4 & hour < 6, "Early Morning",
                                    if_else(hour >= 6 & hour < 9, "AM Peak",
                                            if_else(hour >= 9 & hour < 15, "Mid Day",
                                                    if_else(hour >= 15 & hour < 18, "PM Peak",
                                                            if_else(hour >= 18 & hour < 22, "Evening", 
                                                                    ifelse(hour >= 22 & hour < 24, "Late Night", "Owl"))))))) 

# # Check if there are any NA values in the 'departure_time' column
# summarise(dat,
#     na_count = sum(is.na(departure_time)), #Count NA values
#     total_rows = n(),  # Total number of rows for context
#     percentage_na = na_count / total_rows * 100  # Percentage of NA values
#   )

durations <- c("Owl" = 4, "Early Morning" = 2, "AM Peak" = 3, "Mid Day" = 6, "PM Peak" = 3, "Evening" = 4, "Late Night" = 2)

dat_service_period <- dat %>%
  filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
  group_by(peak_category)%>%
  summarise(peak_count = n(), .groups = 'drop') %>%
  mutate(total_count = sum(peak_count),
         peak_share = round(peak_count/sum(peak_count)*100, digits = 2)) %>%
  arrange(factor(peak_category, levels = c("Owl", "Early Morning", "AM Peak", "Mid Day", "PM Peak", "Evening", "Late Night"))) %>%
  ungroup()

# Create a data frame for service periods
service_periods <- data.frame(
  peak_category = c("Owl", "Early Morning", "AM Peak", "Mid Day", "PM Peak", "Evening", "Late Night"),
  start_hour = c(0, 4, 6, 9, 15, 18, 22),
  end_hour = c(4, 6, 9, 15, 18, 22, 24)
)

# CREATE BAR CHART OF TRIPS BY SERVICE TYPES

# Ensure 'calendar' has 'day_type' defined
calendar <- calendar %>%
  mutate(
    day_type = case_when(
      monday == 1 & tuesday == 1 & wednesday == 1 & thursday == 1 & friday == 1 & saturday == 0 & sunday == 0 ~ "Weekday",
      (saturday == 1 | sunday == 1) ~ "Weekend",
      TRUE ~ "irregular"
    )
  )

# Calendar_dates includes 'service_id', 'date', and 'exception_type'
# Ensure 'calendar_dates' has 'day_type' defined based on the date
calendar_dates <- calendar_dates %>%
  filter(exception_type == 1) %>% #we only need exception_type == 1, those added to the service
  mutate(
    date = ymd(date),  # Convert date format if necessary
    day_of_week = wday(date, label = TRUE, week_start = 1),
    day_type = case_when(
      day_of_week %in% c("Mon", "Tue", "Wed", "Thu", "Fri") ~ "Weekday",  # Monday to Friday
      TRUE ~ "Weekend"  # Saturday and Sunday
    )
  )

# Join calendar_dates with calendar
calendar_joined <- calendar_dates %>%
  full_join(calendar, by = "service_id", suffix = c(".dates", ".cal")) %>%
  mutate(
    final_day_type = case_when(
      # Use calendar_dates' day_type if exception_type indicates added service
      exception_type == 1 & day_type.dates == "Weekday" ~ "Weekday",
      exception_type == 1 & day_type.dates == "Weekend" ~ "Weekend",
      # Default to calendar's day_type where there's no specific exception noted
      TRUE ~ day_type.cal
    )
  ) %>%
  select(service_id, final_day_type) %>%
  group_by(service_id, final_day_type) %>%
  summarise(count = n(), .groups = 'drop') %>%
  arrange(service_id, desc(count)) %>%
  distinct(service_id, .keep_all = TRUE) %>% # Make sure no double day types for service ID
  select(-count)
  
dat <- dat %>%
  left_join(calendar_joined, by = "service_id")

dat_service_type <- dat %>%
  filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
  group_by(final_day_type)%>%
  summarise(day_type_count = n(), .groups = 'drop') %>%
  mutate(total_count = sum(day_type_count),
         peak_share = round(day_type_count/sum(day_type_count)*100, digits = 2)) %>%
  arrange(factor(final_day_type, levels = c("Weekday", "Weekend"))) %>%
  ungroup()

# Summarize data to get the number of trips per hour
hourly_trips <- dat %>%
  filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
  group_by(hour, final_day_type) %>%
  summarise(number_of_trips = n())

# Define a custom palette for the service periods
service_period_colors <- c(
  "Owl" = "#E5E5E5",
  "Early Morning" = "#C5E0B4",
  "AM Peak" = "#A9D08E",
  "Mid Day" = "#FFD966",
  "PM Peak" = "#F4B084",
  "Evening" = "#D5A6BD",
  "Late Night" = "#A4C2F4"
)

# Plot the combined graph
ggplot() +
  # Add service period background rectangles
  geom_rect(data = service_periods, aes(xmin = start_hour, xmax = end_hour, ymin = 0, ymax = max(hourly_trips$number_of_trips), fill = peak_category), alpha = 0.3) +
  # Add the line plot for weekday trips
  geom_line(data = hourly_trips %>% filter(final_day_type == "Weekday"), aes(x = hour, y = number_of_trips), color = "#F2AE2E", size = 1) +
  geom_point(data = hourly_trips %>% filter(final_day_type == "Weekday"), aes(x = hour, y = number_of_trips), color = "#26A699", size = 2) +
  # Add the line plot for weekend trips
  geom_line(data = hourly_trips %>% filter(final_day_type == "Weekend"), aes(x = hour, y = number_of_trips), color = "#A6123A", size = 1) +
  geom_point(data = hourly_trips %>% filter(final_day_type == "Weekend"), aes(x = hour, y = number_of_trips), color = "#26A699", size = 2) +
  # Define scales and labels
  scale_x_continuous(breaks = 0:23) +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = service_period_colors) +
  scale_color_manual(values = c("Weekday" = "#F2AE2E", 
                                "Weekend" = "#A6123A")) +
  labs(title = "Number of Trips by Hour with Service Periods",
       x = "Hour of Day",
       y = "Number of Trips",
       fill = "Service Period",
       color = "Day Type") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5))
  
```

While bus services are integral to Philadelphia's transit system, not all areas benefit equally. Data specifically pertaining to suburban regions during weekday AM and PM peak hours reveals a stark disparity: most suburban bus stations face waiting times ranging from 30 to 60 minutes between buses. This significant wait time not only underscores the challenges faced by commuters like Midge but also highlights a broader issue affecting many suburban residents across Southeastern Pennsylvania, who rely heavily on bus transit.  

```{r set up stops category, message=FALSE, warning=FALSE, cache=TRUE, results = 'hide'}
# SET UP STOPS CATEGORY. This step is crucial as we'll use stops information and join them for each kind of maps 

# Convert stops data to sf object
stops <- stops %>%
  st_as_sf(coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)

# Load PA and SEPTA region map for base map
PA <- st_read("data/PaCounty2024_05.geojson")
SEPTA_region_map <- PA[PA$COUNTY_NAM %in% c("MONTGOMERY", "BUCKS", "PHILADELPHIA", "DELAWARE", "CHESTER"), ] %>%
  select(PA_CTY_COD, COUNTY_NAM, FIPS_COUNT)

# Load Center City boundary
center_city <- st_read("data/20240708_CenterCity/CenterCity.shp")

# Ensure all datasets are in the same CRS
SEPTA_region_map <- st_transform(SEPTA_region_map, crs = st_crs(stops))
center_city <- st_transform(center_city, crs = st_crs(stops))

# Perform the spatial intersection for county
stops <- stops %>%
  st_join(SEPTA_region_map, left = TRUE, join = st_intersects) %>%
  rename(county = COUNTY_NAM) %>%
  mutate(center_city = st_within(stops, center_city, sparse = FALSE))%>%
  mutate(center_city = rowSums(center_city) > 0)  # Convert logical matrix to vector

# Categorize the stops
stops <- stops %>%
  mutate(category = case_when(
    county == "PHILADELPHIA" & center_city == TRUE ~ "PHILADELPHIA-Center_City",
    county == "PHILADELPHIA" & center_city == FALSE ~ "PHILADELPHIA-Not_Center_City",
    TRUE ~ county
  )) %>%
  select(-"center_city")
  
```

```{r hourly stop data, message=FALSE, warning=FALSE, cache=TRUE, results = 'hide'}
# SET UP HOURLY AVERAGE TRIP DATA FOR EACH BUS STOP

dat_trip_count_hr <- dat %>%
  filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
  group_by(stop_id, mode, final_day_type) %>%
  summarise(trip_count = n(), 
            All_Routes = paste(unique(route_id), collapse = ", "),
            .groups = 'drop') %>%
  mutate(avg_trip_hr = round(trip_count/24)) %>%
  mutate(log_avg_trip_hr = round(log1p(avg_trip_hr), digits = 0)) %>%
  mutate(max_freq = ifelse(avg_trip_hr >= 12, "5 MAX",
                           ifelse (avg_trip_hr >= 6, "10 MAX",
                           ifelse (avg_trip_hr >= 4, "15 MAX", 
                           ifelse(avg_trip_hr >= 2, "30 MAX", "60 MAX")))))

dat_trip_count_hr <- left_join(dat_trip_count_hr, stops, by = "stop_id")

dat_trip_count_hr <- st_as_sf(dat_trip_count_hr, coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)

# sum(dat_trip_count_hr$trip_count) #1980505 <- Always make sure you retain the number of observation

```

```{r service period stop data, message=FALSE, warning=FALSE, cache=TRUE, results = 'hide'}
# SET UP AVERAGE TRIP DATA PER SERVICE PERIOD FOR EACH BUS STOP

dat_trip_count_prd <- dat %>%
  filter(mode %in% c("Bus", "Trolleybus", "Trolley")) %>%
  group_by(stop_id, peak_category, mode, final_day_type) %>%
  summarise(trip_count = n(), 
            All_Routes = paste(unique(route_id), collapse = ", "),
            .groups = 'drop') %>%
  mutate(
    hours = durations[peak_category],
    avg_trip_hr = round(trip_count / hours)
  ) %>%
  select(-hours) %>%
  mutate(log_avg_trip_hr = round(log1p(avg_trip_hr), digits = 0)) %>%
  mutate(max_freq = ifelse(avg_trip_hr >= 12, "5 MAX",
                           ifelse (avg_trip_hr >= 6, "10 MAX",
                           ifelse (avg_trip_hr >= 4, "15 MAX", 
                           ifelse(avg_trip_hr >= 2, "30 MAX", "60 MAX")))))

dat_trip_count_prd <- left_join(dat_trip_count_prd, stops, by = "stop_id")

dat_trip_count_prd <- st_as_sf(dat_trip_count_prd, coords = c("stop_lon", "stop_lat"), crs = 4326, remove = FALSE)

# sum(dat_trip_count_prd$trip_count)  #1980505 <- Always make sure you retain the number of observation
```

```{r average bus hourly maxfreq, message=FALSE, warning=FALSE, cache=FALSE}
SEPTA_region_map <- st_transform(SEPTA_region_map, st_crs(dat_trip_count_hr))

# # Define color palette
# pal <- colorNumeric(palette = viridisLite::viridis(256, option = "C"), domain = dat_trip_count_hr$avg_trip_hr)

# Define the color codes for bus revolution and reverse them
bus_revolution_colors <- rev(c("#A6123A", "#655BA6", "#26A699", "#F2AE2E", "#F2E5D5"))

# Ensure max_freq is a factor with levels matching the colors
dat_trip_count_hr$max_freq <- factor(dat_trip_count_hr$max_freq, levels = c("5 MAX", "10 MAX", "15 MAX", "30 MAX", "60 MAX"))

# Define the color palette for max_freq
max_freq_colors <- c("5 MAX" = "#4B0082", "10 MAX" = "#FF4500", "15 MAX" = "#A6123A", "30 MAX" = "#26A699", "60 MAX" = "#F2AE2E")
pal_max_freq <- colorFactor(palette = max_freq_colors, domain = dat_trip_count_hr$max_freq)

# # Generate the list for overlayGroups with the desired order
# overlay_groups <- expand.grid(final_day_type = c("Weekday", "Weekend"), peak_category = peak_order) %>%
#   mutate(group_name = paste(final_day_type, peak_category, sep = "_")) %>%
#   pull(group_name)

# Create the leaflet map
m_avg_trip_hr <- leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data = SEPTA_region_map, color = "black", weight = 1, fill = FALSE) %>%
  addCircleMarkers(data = dat_trip_count_hr %>% filter(final_day_type == "Weekday"),
                   group = "Weekday",
                   lng = ~stop_lon, lat = ~stop_lat,
                   color = ~pal_max_freq(max_freq),
                   radius = 1,
                   opacity = 0.8,
                   popup = ~paste0("<b>Stop ID:</b> ", stop_id, "<br>",
                                   "<b>Stop Name:</b> ", stop_name, "<br>",
                                   "<b>Avg Buses/Hr:</b> ", avg_trip_hr, "<br>",
                                   "<b>Max Frequency:</b> ", max_freq, "<br>",
                                   "<b>All Routes:</b> ", All_Routes, "<br>",
                                   "<b>Day Type:</b> ", final_day_type)) %>%
  addCircleMarkers(data = dat_trip_count_hr %>% filter(final_day_type == "Weekend"),
                   group = "Weekend",
                   lng = ~stop_lon, lat = ~stop_lat,
                   color = ~pal_max_freq(max_freq),
                   radius = 1,
                   opacity = 0.8,
                   popup = ~paste0("<b>Stop ID:</b> ", stop_id, "<br>",
                                   "<b>Stop Name:</b> ", stop_name, "<br>",
                                   "<b>Avg Buses/Hr:</b> ", avg_trip_hr, "<br>",
                                   "<b>Max Frequency:</b> ", max_freq, "<br>",
                                   "<b>All Routes:</b> ", All_Routes, "<br>",
                                   "<b>Day Type:</b> ", final_day_type)) %>%
  addLayersControl(
    overlayGroups = c("Weekday", "Weekend"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  hideGroup(c("Weekend")) %>%
  addLegend(pal = pal_max_freq, values = dat_trip_count_hr$max_freq, title = "Max Frequency",
            position = "bottomright")

m_avg_trip_hr
```

Further analysis of bus stop data by county—excluding Philadelphia's Center City—reveals a misleading equality in service distribution across the counties. While the top bus stops in each county boast a 5-minute maximum headway, reflecting seemingly equal service levels, the actual number of buses tells a different story. For instance, even at their busiest, bus stops in Bucks County see an average of only 15 buses per hour at the 5 MAX category, starkly less than the 30 buses seen in similar categories in Philadelphia’s outlying areas. This discrepancy highlights the concentration of better services still within closer proximity to urban centers, despite attempts at equitable distribution.

```{r top 10 stops table, message=FALSE, warning=FALSE, cache=TRUE}
# Filter and select the top 10 stops for 'mode' == 'bus' on the weekday
top_stops_hr_weekday <- dat_trip_count_hr %>%
  filter(mode %in% c("Bus","Trolleybus"), final_day_type == "Weekday") %>%
  select(stop_id, stop_name, avg_trip_hr, All_Routes) %>%
  arrange(desc(avg_trip_hr)) %>%
  slice_head(n = 10)

# Create a table with kable and enhance it with kableExtra
kable(top_stops_hr_weekday, "html", booktabs = TRUE, 
      caption = "Top 10 Bus Stops by Average Trips Per Hour on the Weekday",
      align = c('l', rep('l', ncol(top_stops_hr_weekday) - 1))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE, color = "#655BA6", width = "5em") %>%
  column_spec(3, bold = TRUE, color = "#A6123A", extra_css = "text-align: right;")

```


```{r top 5 stops table, message=FALSE, warning=FALSE, cache=TRUE}
# Define a custom palette for the categories
category_colors <- c(
  "BUCKS" = "#A9D08E",
  "CHESTER" = "#FFD966",
  "DELAWARE" = "#F4B084",
  "MONTGOMERY" = "#D5A6BD",
  "PHILADELPHIA-Not_Center_City" = "#A4C2F4"
)

# Example with Weekday data
dat_filtered_weekday <- dat_trip_count_hr %>%
  filter(mode %in% c("Bus","Trolleybus"), final_day_type == "Weekday" & category != "PHILADELPHIA-Center_City")

# Function to get top 5 stops per category
get_top_stops_per_category <- function(data, n = 5) {
  data %>%
    group_by(category) %>%
    select(stop_id, stop_name, avg_trip_hr, max_freq, All_Routes, category) %>%
    arrange(desc(avg_trip_hr)) %>%
    slice_head(n = n) %>%
    ungroup()
}

# Get top 5 stops per category for Weekday
top_stops_hr_weekday <- get_top_stops_per_category(dat_filtered_weekday, n = 5)

# Create the table for Weekday with row colors
kable(top_stops_hr_weekday, "html", booktabs = TRUE, 
      caption = "Top 5 Bus Stops by Average Trips Per Hour on the Weekday",
      align = c('l', rep('l', ncol(top_stops_hr_weekday) - 1))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE, color = "#655BA6", width = "5em") %>%
  column_spec(3, bold = TRUE, color = "#A6123A", extra_css = "text-align: right;") 
```

### Impact and Innovation: Enhancing Suburban Bus Services

While the raw data illuminates the service disparities, the true depth of this issue is best captured through the lived experiences of suburban commuters like Midge. Personal stories reveal the broader human impact: missed opportunities, prolonged workdays, and diminished family life due to infrequent bus services. This qualitative insight paints a vivid picture of the daily challenges and underscores the urgency for targeted improvements.

SEPTA's ongoing "Bus Revolution" project aims to modernize and enhance bus services, promising updated routes and improved frequencies that haven't been revised since the 1960s. While the initiative seeks to reduce headways to a maximum of 30 minutes even in the most remote suburban areas, it has faced delays and criticism, particularly concerning its potential to disproportionately impact low-income neighborhoods. This tension highlights the complex interplay between improving service efficiency and addressing community concerns of gentrification.

As we conclude our exploration of suburban bus transit disparities in Philadelphia, the compelling narratives of daily commuters, underpinned by robust SEPTA data, paint a clear picture of the challenges and the urgent need for systemic improvements. The "Bus Revolution" project, while a promising step toward enhancing bus frequencies and updating decades-old routes, also highlights the delicate balance required to improve service efficiency without exacerbating social inequities.

Embracing innovative solutions and fostering community engagement in transit planning are crucial. By committing to both technological upgrades and equitable service distribution, SEPTA can ensure that every commuter, regardless of where they live, has reliable and timely access to public transit. This not only improves individual daily experiences but also supports broader goals of economic mobility and sustainable urban development.

As stakeholders, we are called upon to support these initiatives, advocate for transparent and inclusive planning processes, and ensure that the voices of all commuters, especially those most affected like Midge, are heard and addressed in the pursuit of a truly connected and equitable transit system.